-
Notifications
You must be signed in to change notification settings - Fork 39.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kubelet: persist restart count of a container #6794
Conversation
if containerDone.Has(dockerContainerName) { | ||
continue | ||
} | ||
// We assume docker returns a list of containers sorted reversedly by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: sorted in reverse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
ba18b6f
to
c41f70a
Compare
Uploaded a new patch to address the comments on readability and maintainability. FYI, the previous patch passed e2e. |
If a container is failing and restating rapidly, does this result in a lot On Tue, Apr 14, 2015 at 10:18 AM, Yu-Ju Hong notifications@github.com
|
@erictune it does result in a new status update per crash (this is true without this PR). It is rate-limited by the Kubelet's sync loop which only runs every 10s. |
Okay. SGTM. Presumably that 10s will get even less frequent if the master On Tue, Apr 14, 2015 at 10:32 AM, Victor Marmol notifications@github.com
|
Like @vmarmol mentioned, we update the status on crash with or without this PR. The sync loop frequency could be shorter than 10s if updates to other pods have triggered the new iteration. Overall, the syncing frequency is quite low for now. @erictune, there is also a QPS limiter for the REST client, which should be our safety net. |
Tested this with a local setup and saw the restart count incremented beyond 5:
|
@@ -245,6 +254,9 @@ func (self *DockerManager) GetPodStatus(pod *api.Pod) (*api.PodStatus, error) { | |||
return nil, err | |||
} | |||
|
|||
containerDone := util.NewStringSet() | |||
// Loop through the docker container list to construct the container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a note that these are all the running and exited containers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Otherwise looks good, thanks @yujuhong! |
Currently, restart count are generated by examine dead docker containers, which are subject to background garbage collection. Therefore, the restart count is capped at 5 and can decrement if GC happens. This change leverages the container statuses recorded in the pod status as a reference point. If a container finished after the last observation, restart count is incremented on top of the last observed count. If container is created after last observation, but GC'd before the current observation time, kubelet would not be aware of the existence of such a container, and would not increase the restart count accordingly. However, the chance of this should be low, given that pod statuses are reported frequently. Also, the restart cound would still be increasing monotonically (with the exception of container insepct error).
c41f70a
to
fd34441
Compare
Addressed the comments. |
LGTM, thanks for the patience @yujuhong! |
Kubelet: persist restart count of a container
…rtCount docs wasn't updated as part of kubernetes#6794.
Currently, restart count are generated by examine dead docker containers, which
are subject to background garbage collection. Therefore, the restart count is
capped at 5 and can decrement if GC happens.
This change leverages the container statuses recorded in the pod status as a
reference point. If a container finished after the last observation, restart
count is incremented on top of the last observed count. If container is created
after last observation, but GC'd before the current observation time, kubelet
would not be aware of the existence of such a container, and would not increase
the restart count accordingly. However, the chance of this should be low, given
that pod statuses are reported frequently. Also, the restart cound would still
be increasing monotonically (with the exception of container insepct error).